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Abstract Gene regulatory networks are collections of 
genes that interact with one other and with other sub- 
stances in the cell. By measuring gene expression over 
time using high-throughput technologies, it may be pos- 
sible to reverse engineer, or infer, the structure of the 
gene network involved in a particular cellular process. 
These gene expression data typically have a high dimen- 
sionality and a limited number of biological replicates 
and time points. Due to these issues and the complex- 
ity of biological systems, the problem of reverse engi- 
neering networks from gene expression data demands a 
specialized suite of statistical tools and methodologies. 
We propose a non-standard adaptation of a simulation- 
based approach known as Approximate Bayesian Com- 
puting based on Markov chain Monte Carlo sampling. 
This approach is particularly well suited for the in- 
ference of gene regulatory networks from longitudinal 
data. The performance of this approach is investigated 
via simulations and using longitudinal expression data 
from a genetic repair system in Escherichia coli. 
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1 Introduction 

The development of high-throughput technologies, such 
as microarrays and next-generation sequencing, has en- 
abled large-scale studies to simultaneously assay the 
expression levels of thousands of genes over time. How- 
ever, in spite of the abundance of data obtained from 
these technologies, it can be very difficult to unravel the 
patterns of expression among groups of genes, often re- 



ferred to as gene regulatory networks (CRN; [Friedman, 
2QQ4[ [Wilkinson, 2QQ9D . Within CRN, genes interact 
with one another indirectly through proteins known as 
transcription factors (TF), which control the transfer 
of information during transcription by activating or re- 
pressing a ribonucleic acid (RNA) polymerase, and thus 
affect the level of gene expression (Figure [l]). A CRN 
can thus be described as the interactions that occur 
(indirectly through messenger RNA and TF) within a 
collection of interconnected genes. 

In longitudinal studies of gene expression, the num- 
ber of samples (i.e., biological replicates or time points) 
collected is typically far outweighed by the number of 
observed genes. Because of the exponentially large num- 
ber of possible gene-to-gene interactions, reverse engi- 
neering CRN from longitudinal studies actually ampli- 
fies the "large j9 small n" paradigm. In addition, because 
gene-to-gene interactions coincide with reactions in the 
cellular environment, the network structure can itself be 
very complex. For this reason, standard statistical tech- 
niques cannot be used to infer CRN from gene expres- 
sion data, and a specialized suite of statistical method- 
ologies have been developed. Among these methods, 
a framework known as Dynamic Bayesian Networks 
(DBN) has seen wide application in the context of GRN 



(Husmeier, 2003 


Rangel et al., 2004 Deal et al., 2005 


Rau et al., 2010 


). A DBN uses time-series measure- 
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Fig. 1 A simple gene regulatory network made up of four 
genes. Each gene is transcribed and translated into a tran- 
scription factor (TF) protein, which in turn regulates the ex- 
pression of other genes in the network by binding to their 
respective promoter regions ( jSchlitt and Brazma, 2007| . The 
gene regulatory network may be represented using the graph 
in lower right corner, made up of four nodes (genes) and five 
edges (interactions among the genes). Image taken from ,Rau| 
|(2010| ) 



ments on a set of random variables to characterize their 
interactions over time. To avoid an explosion in model 
complexity, a time-homogeneous Markov model is typ- 



ically used ( jHusmeier et al., 2005 ). This restriction im- 
plies that gene-to-gene interactions are constant across 
time and that biological samples are taken at equidis- 
tant time points. 

Within the framework of DBN, the Bayesian paradigm 
is particularly well-suited to the inference of GRN for 
a number of reasons. First, the number of possible net- 
work structures increases exponentially as the num- 



ber of genes increases (Husmeier et al., 2005). As a 



large number of network structures may yield similarly 
high likelihoods, attempting to infer a single globally 
optimal structure may be meaningless. In such cases, 
posterior distributions of gene-to-gene interactions may 
better characterize a GRN. Second, by examining the 
shape of the posterior distributions within portions of 
a GRN, additional information may be gleaned about 
the structure and inferability of specific gene-to-gene in- 
teractions, as well as the system as a whole. Finally, a 
Bayesian framework allows a priori knowledge to be en- 
coded in the prior distribution structure. Prior knowl- 
edge may refer to certain features of the topology of 
a GRN (e.g., sparsity in the network structure or the 
maximum number of regulators per gene) and to prior 
biological information about well-characterized path- 
ways from bioinformatics databases. 

Unless restrictive assumptions are made about the 
dynamics of the system (e.g., Gaussian prior distri- 
butions for gene-to-gene interactions), the likelihood 
function of a GRN may be intractable or diflficult to 
calculate. In such cases, sampling-based approximate 



inference to be adopted (Pritchard et al., 1999 Beau- 



mont et al., 2002 Marjoram et al., 2003) when simu- 



lation from the model is straightforward. The first im- 
plementation of an ABC algorithm was introduced by 
[Pritchard et al. (1999) . In this approach, using parame- 
ter values simulated from a prior distribution, data are 
simulated and compared to the observed data. When 
the simulated and observed data are sufficiently "close" , 
as determined by a distance function p(-) and tolerance 
e, the parameter values are accepted (Beaumont et al.. 



2002). The algorithm is approximate when e > 0, and 



its output amounts to simulating from the prior when 
e ^ oo. For < e < oo, the algorithm results in a 
sample of parameters from an approximate posterior 
distribution. 

Because a naive application of ABC methods can 
be time-consuming and inefficient, a variety of exten- 
sions have been proposed in recent years. For high- 



dimensional data, Beaumont et al. (2002) found that 
using summary statistics to compare simulated and ob- 
served data, rather than the data points themselves, en- 
ables a reduction of the data without negatively impact- 
ing the approximation. Several adaptations of ABC al- 
gorithms have also been proposed based on Monte Carlo 
techniques. For instance. Marjoram et al. (2003) ex- 
tended the ABC algorithm to work within the Markov 
chain Monte Carlo (MCMC) framework without the 
use of likelihoods. In this approach, which we refer to 
as ABC-MCMC, parameters are proposed from a tran- 
sition distribution (e.g., a random walk) and subse- 



quently used to simulate data. Sisson et al. (2007) used 
a Sequential Monte Carlo technique (SMC-ABC) to 
propagate a population of parameters through a se- 
quence of intermediary distributions to obtain a sample 
from the approximate posterior distribution. In related 
work Beaumont et al. (2009[ ) applied an adaptive se- 
quential technique known as Population Monte Carlo 
(PMC) to the general ABC algorithm to improve its ef- 
ficiency through iterated importance sampling. Further 
recent implementations of ABC algorithms can also be 
found in, e.g., Leuenberger and Wegmann (2009D and 



Drovandi and Pettitt (2010) 



In this work, we propose an extension of the ABC- 
MCMC algorithm to enable the inference of GRN from 
time-course gene expression data. Our approach en- 
ables Bayesian inference without restrictive assump- 
tions about the distribution of gene-to-gene interac- 
tions within a network. The resulting approximate pos- 
terior distributions of interactions within the network 
have the added advantage of providing salient informa- 
tion about the inferability of the biological system as a 
whole. Although there have been some recent develop- 
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merits in network inference using ABC methods, e.g., 
to compare the evolution of protein-protein interaction 
networks (Ratmann et al., 2007 2009) and to conduct 



Wilkinson, 2009) have found that simple, linear mod- 



model selection for systems based on ordinary differ- 



ential equations (Toni et al., 2009 Toni and Stump f. 



2010), to our knowledge this is the first application 



of ABC methods in reverse engineering the unknown 
structure of a gene regulatory network from gene ex- 
pression data. 



2 Approximate Bayesian Computation for 
Networks 

Let y be a set of observed gene expression data for 
P genes at T equally- spaced time points, where yt = 
{yit^ • ' ' ^ypt) represents the gene expression measure- 
ments at time t. In this work, we consider two related 
characterizations of a GRN: an adjacency matrix G, 
and a parameter matrix O. For the former, let G be 
a P X P matrix such that Gij = 1 if gene j regulates 
gene z, and Gij = otherwise. For the latter, we define 
as the P X P parameter matrix of a GRN, where 
Oij represents the relationship between gene j at time 
t — 1 and gene i at time t. For this matrix, a value of 
Oij = indicates that gene j does not regulate gene i; 
if 6ij > {6ij < respectively), gene j activates (re- 
presses) gene i. Note that P(% = 0\Gij = 0) = 1, and 
P{Oij = 0\Gij = 1) = 0. We win further discuss the si- 
multaneous use of the matrices and G in Section |231 
Our objective in this work is to determine which 
gene-to-gene interactions within the GRN may be in- 
ferred, based on their approximate posterior distribu- 
tions. To accomplish this, we first introduce the Bayesian 
model used to model the time-course gene expression 
data Y and corresponding gene regulatory network 0. 
After motivating the use of ABC methods in this con- 
text, we then introduce the ABC-MCMC algorithm of 



Marjoram et al. (2003) in greater detail, and describe 



our modifications for reverse engineering GRN. 

2.1 Bayesian Model 

2.1.1 Likelihood specification 

For a given gene regulatory network 0, we model the 
time-course gene expression data as a time- homogeneous 
Markov model 



Y r.l[f{yt;yt_u0)■ 



with the convention that yo = 0. Several authors (e.g., 
Beal et al., 2005[ |Opgen-Rhein and Strimmer, 2007 



els can in some cases yield good approximations of the 
dynamics occurring within complicated biological sys- 
tems. To this end, one simple yet effective choice for 
the density / in Equation ([T]) is a first-order vector au- 
toregressive (VAR(l)) model: 



= 0yt-i + 



(2) 



where e^ is an error term satisfying E{et) = 0, E{ete'f.) = 
i7 (a P X P positive definite covariance matrix), and 



^(e^e^,) = 0. In previous work (e.g., Beal et al., 2005 



Rau et al., 2010), the errors e^ have additionally been 
assumed to follow a normal distribution, et ~ A^(0, E). 
In this work, we do not impose any particular form for 
the distribution of the errors e^ beyond the assumptions 
on the first two moments previously mentioned. 

2.1.2 Network Prior Distributions 

To fully define the Bayesian model used for y, we must 
also specify the prior distributions for the adjacency 
matrix G and parameter matrix 0, 7r{G) and 7r{0\G). 
In a GRN, as the number of genes (P) in a network in- 
creases, the number of possible interactions within the 
network quickly increases (P x P). As a large number 
of genes may interact simultaneously with one another 
in very sophisticated regulatory circuits, the network 
topology itself may be quite complicated. Even so, cer- 
tain properties of biological networks can be useful in 
limiting the support of the prior distribution to realistic 
network topologies. In particular, most genes are reg- 



ulated just one step away from their regulator ^Alon, 



2007), and gene networks tend to be sparse, with a lim- 



ited number of regulator genes (Leclerc, 2008). 



In keeping with these biological hypotheses, we elect 
to use uninformative prior distributions with some re- 
strictions for both 7r(G) and 7r{0\G). We restrict the 
number of regulators for each gene (referred to as the 
fan-in for each gene in the network). Because GRN are 
known to be sparse, we choose the prior on the ad- 
jacency matrix, 7r(G), to be uniform over all possible 
structures, subject to a constraint on the maximum 
fan-in for each gene in the network, as has been sug- 



gested (Friedman, 2000 Husmeier, 2003 Werhli and 



'Husmeier, 2007). This restriction is supported by the 



biological literature, as genes do not tend to be syn- 
chronously regulated by a large number of genes ( [Leclerc , 
2008). For the parameter prior 7r{6ij\Gij = 1), we use 



a uniform distribution, where the bounds are chosen to 
represent a realistic range of interaction magnitudes in 
GRN. In this work, we use bounds of -2 and 2 for all Oij^ 
as these correspond to strong repression and activation 
effects, respectively. 
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2.2 ABC Motivation 

Given the likelihood and prior distributions defined in 
the previous section, our goal is to reverse engineer a 
GRN from observed expression data Y via the posterior 
distribution 

7r(0,G|y)cx/(F|0)7r(0|G)7r(G). 

In some cases, the error term included in the like- 
lihood in Equation ^ is assumed to follow a well- 
known distribution (e.g., a Normal distribution). This 
hypothesis would enable straightforward calculation of 
the likelihood, and in turn, the posterior distribution 
7r(0|y), whether through explicit calculation or a stan- 
dard MCMC sampler. However, in this work we do not 
impose a specific distributional form for e^, and as such, 
the likelihood f{Y\0) cannot be evaluated. It is exactly 
in situations such as this that ABC methods have been 
successfully developed and applied in recent years. 



Simple ABC rejection methods (e.g., Pritchard et al. 
|(1999| )) have the advantage of being easy to code and 
generating independent observations, but can be ex- 
tremely time consuming and inefficient, particularly in 
the case of GRN. To illustrate, we applied the follow- 
ing simple ABC rejection method to sample from the 
approximate posterior distribution 7t{0^G\Y): 

1. Generate G and from 7r(G) and 7t{0\G)^ respec- 
tively. 

2. Generate one-step- ahead predictors from model 
[2j given yt-i and 0^ (see Section 2.4 for a discussion 
of this simulation strategy). 

3. Calculate the distance p{Y,Y^) between Y and Y^ . 

4. Accept {0'^,G'^) if p < e, where e is chosen as de- 
scribed in Section 



Using this algorithm, only 5 proposed networks (0^, G'^) 
are accepted out of a total of 1 x 10^ proposals (data not 
shown). Because such an approach is both inefficient 
and unpractical, we focus instead on the ABC-MCMC 



Simulated and observed data are compared using a dis- 
tance function p(-) and tolerance e, and proposed pa- 
rameters are accepted with probability 



a 



mm 



<e) 



where l(-) is an indicator function that replaces the 
likelihood, and 7r(-) represents the prior distributions of 



(0, G). Under suitable regularity conditions (Marjoram 



et al., 2003), it is straightforward to show that the sta- 



tionary distribution of the chain is indeed the approx- 
imate posterior distribution. If e is sufficiently small, 
then this distribution will be a good approximation to 
the true posterior distribution 7t{0,G\Y). However, a 
balance must be achieved between a small enough tol- 
erance to obtain a good approximation to the poste- 
rior and a large enough tolerance to allow for feasible 



computation time. Bortot et al. (2007 ) proposed a fur- 
ther adaptation of ABC-MCMC for the purpose of im- 
proving its mixing properties using data augmentation 
techniques, known as the ABC-MCMC augmented al- 
gorithm. Specifically, the parameter space is augmented 
with the tolerance e, which is treated as a model param- 
eter with its own pseudo-prior distribution. Although 
this algorithm alleviates the problem of insufficient mix- 
ing, since larger values of e may be accepted, it typi- 
cally requires a much larger number of iterations than 
the original ABC-MCMC algorithm. 

Adapting the ABC-MCMC algorithm of |Marjorani| 
et al. (2003 ) to the context of GRN requires two impor- 



approach of Marjoram et al. (2003). 



tant considerations to be taken into account: 1) com- 
putationally efficient methods for simulating data Y^ 
from a known GRN (defined by its parameter matrix 
0"^), and 2) an appropriate proposal distribution ^(-l-) 
for both the network structure and parameters. We re- 
fer to the algorithm incorporating these adaptations as 
the ABC for Networks (ABC-Net) method. For clarity, 
although we limit this discussion to data with a single 
biological replicate, the extension to multiple replicates 
is straightforward. 



2.3 ABC Markov Chain Monte Carlo for Networks 



The ABC-MCMC algorithm (Marjoram et al., 2003) 



makes use of the standard Metropolis-Hastings scheme 



(Hastings, 1970) to obtain samples from the approx- 



imate posterior distribution 7r(0, G|p(F*, F) < e). To 
accomplish this, matrices 0* and G^ are proposed based 
on a proposal distribution ^(-l-) and subsequently used 
to simulate data Y^ based on a given model f{-\0'^). 



2.4 Simulating Data for Gene Networks within ABC 

One of the most important considerations in adapting 
the ABC-MCMC algorithm to the inference of GRN 
is identifying an efficient simulator for proposed net- 
work parameter matrix 0^. Broadly, we simulate gene 
expression at time t as a function of gene expression 
at the previous time point and the proposed parame- 
ter matrix 0* using a VAR(l) model as in Equation 
[2j Specifically, after setting = yi, we exploit the 
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Markov property of the VAR(l) model to obtain one- 
step-ahead predictions (i.e., fitted values) of gene ex- 
pression at time points t = 2, . . . , T: 



(3) 



Note that the one-step-ahead predictions for are made 
using the observed data y^-i, and not the simulated 
data yt_i. That is, we simulate data deterministically 
by calculating the expected value of gene expression at 
each time point given the network structure and ob- 
served expression values at the previous time point, 
rather than incorporating an estimate of noise in the 
simulated data. 

We are aware that the deterministic simulation pro- 
cedure discussed above is somewhat unconventional in 
the ABC literature, primarily since it does not incor- 
porate an estimate of noise in the simulated data Y^. 
More classically, repeated sampling is used to control 
the variability of the data by simulating several noisy 
datasets {Y^^...^Y^} for a given network 0^ ^ with 
M > 1. Keeping this in mind, adding no noise can be 
seen as the limiting case of M > 1 replications of the 
dataset generated from the same 0, as advocated in 



some ABC procedures (Del Moral et al., 2009). In our 



case, the choice to use the one-step-ahead predictors as 
in Equation ([3| is a practical one. More specifically, be- 
cause the time-series expression data are modeled as a 
VAR(l) process, we found that adding noise at early 
time points simply had the effect of inducing wide dis- 
crepancies at later time points, as incorrect error terms 
compounded throughout the simulated time series. This 
had the effect of creating large distances p(F, F*), even 
when the true network was used to generate Y*. 

Finally, the appropriateness of using a VAR(l) sim- 
ulator. Equation (|3|, is largely dependent on the noise 
present in observed data, as well as the adequacy of 
the assumption of time- invariant, first-order autoregres- 
sive dynamics for complicated GRN. In the absence of 
more detailed information about the underlying net- 
work, it may be reasonable to use a simple model such 
as the VAR(l) to generate simulated data. We note 
that the ABC-Net algorithm has the flexibility to in- 
corporate arbitrary models as data simulators, provided 
they are computationally efficient. For instance, in some 
cases second-order models, nonlinear models, linear dif- 
ferential equations, draws from a Dirichlet process, or 
Michaelis-Menten kinetics may more aptly describe the 
dynamics of a particular GRN; in these cases, the ap- 
propriate simulator model would be used in place of 
Equation 



2.5 Two-Step Network Proposal Distributions 

Another important consideration is the proposal distri- 
bution q{-\-) that defines the transition from the current 
proposal for a GRN to an updated proposal. Based on 
the current values of G and 0, a two-step proposal dis- 
tribution is used to produce new samples and 0^ 
for the adjacency and parameter matrices, respectively. 
In this context, the adjacency matrix G may be viewed 



as an auxiliary variable (Damien et al., 1999), which is 



introduced to simplify the Markov chain Monte Carlo 
algorithm. As such, the joint distribution of G and 
may be seen as a completion of the marginal density of 



(Robert and Casella, 2004) which facilitates simula- 



tion within the MCMC algorithm. 



In the first step, one of three basic moves (Husmeier 



et al., 2005P is applied to the current adjacency matrix 
G*: adding an interaction (i.e., changing a to a 1), 
deleting an interaction (i.e., changing a 1 to a 0), or 
reversing the direction of an interaction (i.e., if Gij = 1 
and Gji = 0, exchanging these two values). If M{G) rep- 
resents the neighborhood size of a particular adjacency 
matrix G, (i.e., the number of other network structures 
that can be obtained by applying one of these three ba- 
sic moves), the transition probability of the first step is 
given by g(G^|ff ) = 1/m\g'). 

In the second step, the proposal distribution of 0, 
given the current value 0* and the updated adjacency 
matrix G*, is defined to be 



ro ifG*=0 



(4) 



where cjq is the variance of the proposal distribution, 
and CF0 may be tuned to obtain an empirical acceptance 
rate between 15% and 50%, as recommended in Gilks 



et al. (1996). A simple example of the two-step proposal 
distribution for GRN is shown in Figure |2j 

It is worth noting that the introduction of the ad- 
jacency matrix G is not strictly necessary to accom- 
plish the two-step proposal described above. For in- 
stance, it would be straightforward to define the tar- 
get with respect to a mixture of singular measures, 



e.g., a dirac mass and a Gaussian density (see Gottardo 



and Raftery, 2004, for more details). Furthermore, the 
three proposal moves (add, delete, and reverse a net- 
work edge) could be defined using a mixture of ker- 
nels that include selection probabilities depending on 
the current state. Our primary motivation for includ- 
ing G is based on the approach for learning Bayesian 



networks in Chapter 2 of Husmeier et al. (2005), which 



clearly distinguishes the network structure (i.e., the set 
of edges and nodes represented by the adjacency matrix 
G) from the network parameters (the matrix 0). The 
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Fig. 2 Example of two-step proposal distribution for GRN. 
Top row: A network in iteration i of the ABC-Net algorithm 
may be characterized both by its adjacency matrix (left) 
and its parameter matrix (right). The former encodes only 
the presence (1) or absence (0) of an interaction. The latter 
encodes additional information about the magnitude of a par- 
ticular interaction, where zeros indicate that an interaction is 
not present, positive values indicate an activation, and nega- 
tive values indicate a repression (interactions with values fur- 
ther away from zero correspond to stronger effects). Bottom 
row: An updated network is proposed by adding, deleting, 
or reversing an interaction in to produce G^ (left). The 
parameter matrix 0* is updated using a Gaussian proposal 
distribution for the nonze ro interactio ns of G^ to produce 
(right). Image taken from |Rau (2010| ) 



three-move proposal strategy we apply for the structure 
of the network is based on this intuitive representation, 
and is rather popular in Bioinformatics (see e.g.,[Hus^ 
meier et al., 2005). 



of accepted parameters as the number of iterations in- 
creases. This cooling scheme also addresses the poor 
mixing often observed in the ABC-MCMC algorithm, 
as larger tolerances in the early iterations of the burn-in 
are associated with higher acceptance rates. A total of 
200 iterations are run for each of ten cooled threshold 
values, and the burn-in period is repeated if the em- 
pirical acceptance rate is less than 1%. This ensures a 
minimum burn-in period of 2000 iterations, with ad- 
ditional iterations included for chains affected by poor 
mixing. 

Because the ABC-Net algorithm relies on a com- 
parison between simulated and observed data to avoid 
a likelihood calculation, long chains are required to en- 
sure the adequacy of the approximation. Although a 
single long chain could be run, it is also possible to 
run multiple overdispersed chains. In practice, we run 
10 independent chains of length 1 x 10^ simultaneously 
(rather than a single chain of length 1 x 10^). This ap- 
proach contributes a two- fold benefit, as calculations 
can be performed in parallel to improve computational 
speed and a convergence assessment can be conducted 



using the Gelman- Rubin statistic R (Gelman and Ru- 



bin, 19 92 ) . Following the recommendation in Gilks et al. 



(1996) we declare chain convergence if R < 1.2 for 



all parameters in 0. After the chains have converged, 
draws corresponding to the smallest 1% of the distance 
criterion are retained for inference. 



2.6 ABC-Net Implementation 

The output from the ABC-Net algorithm consists of 
dependent samples from the stationary distribution of 
the chain, /(0, G|p(F*, F) < e). In practice, because 
saving all iterations from the MCMC run can take up 
a large amount of storage (particularly as the size of 
the network increases) and consecutive draws tend to 
be highly correlated, we thin the chain at every 50*^ 
iteration. Additionally, as with many MCMC methods, 
a burn-in period is implemented to reduce the impact of 
initial values and to improve mixing for the chain. The 
length b of the burn-in depends on the starting values 
of the chain, and G^, the rate of convergence of the 
chain, and the similarity of the transition mechanism 
of the chain to the approximate posterior distribution. 
We follow the suggestion of Geyer (1992 J , setting b to 
between 1% and 2% of the run length n. 

We also implement a "cooling" procedure during the 
burn-in period similar to that used in |Ratmann et al.| 
|(2007| ), where acceptance of (G"^, 0'^) is controlled by a 
decreasing sequence of thresholds, until the minimum 
pre-set value e is reached. Note that tempering the ac- 
ceptance threshold e in this way reduces the number 



3 Simulation Study Based on the Raf Pathway 

In this simulation study, we focus on four specific as- 
pects related to the performance of the ABC-Net: the 
distance function p and tolerance e, the sensitivity to 
prior distribution bounds, the suitability of the model 
used to generate simulated data when more complicated 
dynamics are at play, and the effect of increasing the 
amount of noise present in the observed data. To do 
so, we focus on the Area Under the Curve (AUG) of 
the Receiver Operating Characteristic (ROC) curve as 
an indicator of performance, as well as qualitative ex- 
aminations of the approximate posterior distributions 
of interactions in the network. To calculate the AUC , 
we retain only the samples corresponding to the small- 
est 1% of distances p{Y'^^Y) for inference. Based on 
these samples, we calculate the bounds of the a% cred- 
ible intervals for each gene-to-gene interaction, where 
a = {1, . . . , 100}. If the a% credible interval for a par- 
ticular interaction does not contain 0, the gene-to-gene 
interaction is declared to be present; otherwise, the in- 
teraction is declared to be absent. In this way, because 
the simulation setting determines which interactions are 
truly present and absent, true positives, false positives. 
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Fig. 3 T he currently acce pted gold-standard Raf signalling 
pathway (|Werhli and Husmeier, 2007), which describes the 
interactions of eleven phosp horylated proteins in primary hu- 
man immune system cells ( [Sachs et al., 20051 - Nodes rep- 
resent the proxy genes of each of the eleven proteins (i.e., 
the genes that are transcribed and translated into the corre- 
sponding proteins), and arrows i ndicate th e direction of signal 
transduction. Image taken from |Rau (2010| ) 

true negatives, and false negatives may be calculated for 
each a, and the AUG may subsequently be calculated. 
The values of a may be adjusted for multiple testing, 
if necessary. 

3.1 Simulation Design 



Area Under the Curve (By Distance Function and Threshold) 




1% 5% 10% 1% 5% 10% 1% 5% 10% 1% 5% 10% 
Canberra Euclidean Manhattan MVT 

Distance Function and Threshold 



Fig. 4 Area Under the Curve (AUG) of the Receiver Oper- 
ating Characteristic (ROC) curve for four choices of distance 
functions in the ABC-Net algorithm: Canberra, Euclidean, 
Manhattan, and MVT distances. Black dots represent the 
value of the AUC for each of five independent datasets per 
threshold and distance function (with the exception of the 
MVT distance, which was limited to two datasets due to its 
computational burden). The threshold e was set at the 1%, 
5%, and 10% quantiles from 5000 randomly ge nerated net - 
works. Blue lines represent loess curves ( [Cleveland, 1979| ). 
Image taken from |Rau (2010[ ) 



Rather than defining an arbitrary network 0, we in- 
stead make use the structure of a well-characterized 
pathway in human immune system cells involving the 
Raf signalling protein (Sachs et al., 2QQ5| . We generate 
data based on 11 genes, where the adjacency matrix 
QRa^ defined using the structure of the currently ac- 
cepted Raf signalling network (Figure |3|. If an interac- 
tion is present from gene j to gene i, we sample 6^^^ 
uniformly from the interval (—2, —0.25) U (0.25, 2), and 
otherwise 0^^^ = 0. The bounds for non-zero gene-to- 
gene interactions were chosen to represent a range of 
moderate to strong interactions among genes. We gen- 
erate one replicate of expression data for each of the 11 
genes over 20 time points, using the VAR(l) model 



yt = e^'^Vt-i + zt 



(5) 



for t = 1,...,T, where yi - N{0,I), and Zt - N{0,(t^). 
For each simulation, unless otherwise noted, the noise 
standard deviation is set to a = 1, the Gaussian pro- 
posal standard deviation in Equation Q is set to ao = 
0.5, and the maximum fan-in is constrained to 5 or less. 



3.2 Choice of p and e 

The distance function p and threshold e are essential 
components to the ABC-Net method, as they directly 
affect the probability that simulated data generated 



by a network are accepted as being "close enough" 
to the observed data. Although there are many po- 
tential options for this distance function, we focus on 
a comparison among the Manhattan, Euclidean, Can- 



berra, and Multivariate Time-Series (MVT; [Lund and 



Li, 2009) distances (see Appendix). For each choice 



of p, we propose a heuristic method where 5000 ran- 
domly generated networks are used to simulate data, 
and the corresponding distances p{Y^^Y) are calcu- 
lated for each. Subsequently, e is set to be either the 
1%, 5%, or 10% quantile of these distances associated 
with 5000 randomly generated networks. The number 
of randomly generated networks was chosen based on 
a set of preliminary simulations that indicated that 
the quantiles for the corresponding distances p{Y'^^Y) 
seemed to stabilize for 5000 or more networks (data not 
shown). For larger networks, further exploratory simu- 
lations may need to be performed to ensure that this 
number is not too small. Each combination of p and 
e was repeated over five independent datasets in order 
to include an assessment of their variability (only two 
datasets were simulated for the MVT distance due to 
its computational burden). 

Each distance function under consideration calcu- 
lates and penalizes differences between simulated and 
observed data in a different way. In particular, the be- 
havior of the MVT function appears to differ from that 
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Convergence Assessment by Prior Bounds 



Convergence cutoff (= 1 .2) 



i 



12 3 4 

(-2,2) 



1 2 3 4 5 

(-3,3) 



12 3 4 

(-5,5) 



12 3 4 
(-10,10) 



Prior Bounds and Replicate Number 



Fig. 5 Gelman-Rubin statistics (R) for each rephcate of four 
choices of bounds on the prior distribution 7r{0\G). The black 
dotted line indicates a value of R = 1.2, the cutoff at which 
convergence is declared among ten independent chains in the 
ABC-Net algorithm. Image taken from |Rau (2010| ) 



of the other distance functions, with much lower AUG 
values for each combination of p and e (Figure [4| . The 
Canberra, Euclidean, and Manhattan distances all ap- 
pear to be on par with one another, particularly when 
e is set at the 1% quant ile of distances. However, based 
on the criterion of AUG alone, there does not seem to 
be strong evidence that favors one choice among the 
Canberra, Euclidean, and Manhattan distances, partic- 
ularly for a cutoff of e = 1%. That is, although the MVT 
distance is a poor choice of distance function within the 
ABC-Net algorithm, the remaining distances yield simi- 
lar results. Because it enjoys a slight advantage over the 
Manhattan and Canberra distances in terms of compu- 
tation time, we use the Euclidean distance with e set to 
the 1% threshold for the remainder of the simulations. 



highlights the need for well-chosen prior bounds for the 
inference of GRN. 

We also consider the effect of the choice of prior 
bounds on the shape of the approximate posterior dis- 
tributions in the network. In Figure [6j a graphical ma- 
trix of the marginal approximate posterior distribu- 
tions of each interaction in the network is given for 
prior bounds (-2,2). As may be expected, the approxi- 
mate posteriors are generally more diffuse when wider 
prior bounds are used. However, regardless of the choice 
of prior bound, some gene-to-gene interactions consis- 
tently have very fiat (diffuse) approximate posterior dis- 
tributions (e.g., those in the Pip3 column), while oth- 
ers tend to be consistently peaked (e.g., those in the 
Erk column) . We refer to gene-to-gene interactions with 
these two characteristics as "flexible" and "rigid", re- 
spectively. Interestingly, in this simulation, the most 
rigid interactions appear to correspond to regulators 
that are furthest downstream in the simulated pathway 
(Mek, Erk, and Akt), while those furthest upstream ap- 
pear to be the most flexible. In the context of the ABC- 
Net method, this suggests that rigid interactions (e.g., 
Mek^Erk) in 0^ must take on values within a tight 
interval in order to generate simulated data that 
are close (in terms of p and e) to the observed data Y. 
Conversely, flexible interactions (e.g., Pip3^Pip3) can 
take on values within a much wider interval without 
negatively affecting the proximity of simulated and ob- 
served data. Thus, it is likely that the model is most 
sensitive to parameters with narrow credible intervals 
(rigid interactions) and least sensitive to those that can- 
not accurately be localized (flexible interactions) by the 



approximate posterior distribution (Toni et al., 2009). 



That is, it appears that some interactions may intrin- 
sically be easy to infer even with relatively wide prior 
bounds, while others cannot be accurately determined 
even with reasonable prior distribution bounds. 



3.3 Sensitivity to prior distribution bounds 



3.4 Suitability of VAR(l) Simulator 



Although the prior bounds (-2 and 2) for 7r(0|G) are 
reasonable for the context of GRN, we also consider 
the following bounds: (-3,3), (-5,5), and (-10,10). These 
intervals include somewhat "unrealistic" values for 6), 
but are more diffuse (and hence less informative). The 
greatest effect of using less informative prior distribu- 
tions is in terms of the convergence of the ten indepen- 
dent chains, as assessed by the Gelman-Rubin statistic 



(Figure 3.2). This is most evident for prior bounds of 
(-10,10), where a large number of interactions exceed 
the convergence cutoff of 1.2 by a large amount. It is 
perhaps unsurprising that wider prior bounds lead to 
problems in chain mixing and convergence, and thus 



The applicability of the ABC-Net method to real GRN 
relies heavily on its ability to accurately simulate data 
for a given network structure. It is feasible that real 
biological systems do not follow a VAR(l) model, and 
in fact, that they arise from very complicated, nonlin- 
ear relationships. To assess how the ABC-Net method 
performs when observed data Y are actually gener- 
ated from more complicated models, we focus on four 
models (Table [T]): a first-order nonlinear VAR model 
(VAR-NL(l)), a second-order VAR model (VAR(2)), a 
second-order nonlinear VAR model (VAR-NL(2)), and 
an ordinary differential equation (ODE). For the VAR 
models, Of^^ and 6^2^^^ were each defined using the 
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Approximate Posterior Distributions 

Pip3 PIcy Pip2 pkc P38 Pka Jnk Raf Mek Erk 




-2 2 -2 2 -2 2 -2 2 -2 2 -2 2 -2 2 -2 2 -2 2 -2 2 -2 2 



Parameter value 

Fig. 6 The structure of the true Raf signalhng pathway, 0^^^, and a graphical matrix of the marginal approximate posterior 
distributions for every interaction in the network, with prior bounds (-2,2). Each element of the graphical matrix corresponds 
to the same element of 0^^^, i.e., the density in the second row and first column corresponds to 0^^^ (Pip3— ^Plc7). The x-axis 
of each plot represents the values of each parameter Ofj^^, and the y-axis represents the corresponding density. Black dotted 



lines are included on plots where Ofj^^ 7^ at the true value. Image taken from Rau (2010) 



structure of the Raf signalling network, where exist- 
ing interactions were sampled uniformly from the in- 
terval (-2,-0.25) U (0.25,2) and otherwise set to 0. 
For the ODE model, coefficients were randomly drawn 
from a U{ — 1^1) distribution and initial values for all 
genes were set to 1. After solving the ordinary differ- 
ential equations for time points t = 1, . . . , 20, random 
noise sampled from A^(0, 1) was added to each measure- 
ment at each time point. 

It is not surprising that the ABC-Net has the best 
performance in terms of AUG for the VAR(l) model, 
as the data Y are generated with the same model that 
is used to simulate Y"^ (Figure [7|. For the other simu- 
lator models, the performance of the algorithm notice- 
ably declines, with the lowest AUG values observed for 
the two second-order models, VAR(2) and VAR-NL(2). 
The nonlinear first-order VAR model shows wide vari- 
ability in its results, ranging from an AUG of just over 
0.40 to over 0.70. Of the alternative models, the ordi- 
nary differential equation appears to have the highest 



performance in terms of AUG. As a final note concern- 
ing the performance of the ABG-Net algorithm when 
alternative models are used to generate F, recall that 
the simulator described in Section |2^ has the flexibil- 
ity to incorporate alternative models, provided they are 
computationally efficient. In this respect, the VAR(l) 
model may be viewed as a kind of robust null model to 
apply when nothing is precisely known about the dy- 
namics of a particular system. However, in cases where 
other models are known to better fit a given set of data 
(e.g., a second order or non-linear model), the ABG-Net 
method can be adapted accordingly. 



3.5 Effect of Noise in Observed Data 

We expect that increasing amounts of noise in the ob- 
served data (i.e., a in Equation [5| lead to reduced per- 
formance for the ABG-Net algorithm, particularly since 
the VAR(l) simulator uses one- step ahead predictors to 
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Table 1 Alternative models used to generate observed data 
Y: an ordinary differential equation (ODE), a second-order 
VAR model (VAR(2)), a first-order nonlinear VAR model 
(VAR-NL(l)), and a second-order nonlinear VAR model 
(VAR-NL(2)) 



Area Under the Curve (By Observed Noise) 



Model 


Network Equations to Generate Y 


VAR-NL(l) 


yi = zi 

yt=0f^Vt"-i+zt, 

for t = 2,...,T 
Zt -iV(0,l) for t= 1,...,T 


VAR(2) 


yi = zi 

y2 = 0f ^Vi + Z2 

yt = 0f "Vt-i + 0^-'yt-2 + Zt, 

for t = 3,...,T 
Zt -iV(0,l) for t= 1,...,T 


VAR-NL(2) 


yi = zi 

y2 = ©f'^Vt"-! + Z2 

yt = 6)f "Vt"-i + 0^-'yt-2 + Zt, 

for t = 3,...,T 
Zt -iV(0, 1) for t= 1,...,T 


ODE 


^Pkc = 0.182/pic^ - 0.752/Pip2 

^Raf = -0.28ypkc + 0.62ypka 

^Mek = 0-63ypkc - 0.972/Raf - 0.52^pka 

^Erk = O-^O^Mek " 0.942/Pka 

^Pka = 0.31ypkc 

y'j^^^ = 0.28yErk + 0.602/Pka + 0.922/Pip3 

^P38 = -0.19ypkc - 0.322/Pka 

2/jnk = 0.242/Pkc + 0.982/Pka 
Vricy = 

Vpips = -0.282/pic^ 

^Pip2 = 0-832/P1C7 - 0.98ypip3 



Area Under the Curve (By Model) 



o 

ZD 
< 



VAR(1) VAR-NL(1) VAR(2) VAR-NL(2) ODE 
Model Type 

Fig. 7 Area Under the Curve (AUG) of the Receiver Op- 
erating Characteristic (ROC) curve for five difi"erent model 
choices to generate Y: VAR(l), VAR-NL(l), VAR(2), VAR- 
NL(2), and ODE. Black dots represent the value of the AUC 
for each of five independent datasets per bound. Image taken 
from , Rau (2010J 



simulate data based on a given network 0*. To evaluate 
this, we consider 

a = {0,0.1,0.25,0.5,0.75,1,1.5,2,3,5}, 




0.00 0.50 1.00 



1.50 2.00 3.00 
Observed Noise Standard Deviation 



Fig. 8 Scatterplots of the Area Under the Curve (AUC) of 
the Receiver Operating Characteristic (ROC) curve for the 
ABC-Net algorithm, with difi"ering values of noise standard 
deviation a (0, 0.1, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 5). Five datasets 
were generated for each value of nois e standard deviat ion. The 
blue line r epresents a loess curve ( [Cleveland, 1979] ). Image 
taken from [Rau (2010| ) 



where Zt ^ N{0,cr). The AUC results (Figure [8| in- 
dicate that the presence of increasing noise over the 
investigated range does seem to negatively affect the 
performance of the ABC-Net algorithm, although only 
for relatively large values of a (e.g., a = 5). As the noise 
standard deviation increases, it is not surprising that 
the performance of the algorithm deteriorates, since the 
one-step-ahead predictors fall increasingly further from 
the observed data (even when the true network is used). 

We also examine the approximate posterior distri- 
butions of the network for two different values of noise 
standard deviation, a = 0.5 and a = 5 (Figure |9|. 
For the most part, posterior distributions for both a = 
0.5 and a = 5 seem to have the same general shape, 
with some occasional discrepancies (e.g., Akt^Akt and 
Akt^Erk). In addition, as in previous simulations, we 
note once again the marked difference in posterior dis- 
tributions between rigid interactions (peaked distribu- 
tions) and flexible interactions (diffuse distributions). 
Regardless of the amount of noise incorporated into the 
simulated data for the Raf signalling pathway, the ap- 
proximate posterior distributions for the upstream and 
downstream portions of the network are consistently 
flexible and rigid, respectively. This seems to indicate 
that some interactions are intrinsically easier to infer 
(even in the presence of increased noise), while oth- 
ers cannot be accurately determined regardless of the 
amount of noise in the data. As such, the flexibility and 
rigidity of interactions in a given system likely plays an 
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Approximate Posterior Distributions 

PIcy Pip2 pkc P38 Pka Jnk Raf Mek Erk Akt 




■2 2-2 2 



Parameter value 

Fig. 9 The structure of the true Raf signaUing pathway, 0^^^, and a graphical matrix of the marginal approximate posterior 
distributions for every interaction in the network, for u — 0.5 and 5. Each element of the graphical matrix corresponds to the 
same element of 0^^^, i.e., the density in the second row and first column corresponds to B^^^ (Pip3— ^Plc7). The x-axis of 
each plot represents the values of each parameter ^J^-^^, and the y-axis represents the corresponding density. Red and blue lines 



correspond to results obtained with a — ^ and cr 
at the true value. Image taken from |Rau (2010| ) 



0.5, respectively. Black dotted lines are included on plots where ^J^^^ 7^ 



important role in the global inferability of the network 
structure. 



4 Application to S.O.S. DNA Repair System in 
Escherichia coli 

The S.O.S. DNA repair system in the Escherichia coli 
bacterium is a well-known gene network that is respon- 
sible for repairing DNA after damage. The full network 
is made up of about thirty genes working at the tran- 
scriptional level. The behavior of these genes in the 
presence of DNA damage has been well characterized 



(Ronen et al., 2002^. Specifically, under normal condi- 



tions a master repressor called lexA represses the ex- 
pression of the genes responsible for DNA repair. How- 
ever, when one of the S.O.S proteins (recA) senses DNA 
damage by binding to single-stranded DNA, it becomes 
activated and provokes the autocleavage of lex A. The 



subsequent drop in the levels of lexA suspends the re- 
pression of the S.O.S. genes, and these genes become 
activated. Once DNA damage has been repaired, the 
level of recA drops, which allows lex A to reaccumulate 
in the cell and subsequently repress the S.O.S. genes. 
At this point, the cells return to their original state. 
Although the network itself is quite small, its simple 
structure allows the cell to react in very sophisticated 
ways to conditions within the cell. 



4.1 Data 

We focus on a sub- network within the S.O.S. DNA re- 
pair system made up of eight genes: uvrD, lex A, umuD, 
recA, uvrA, uvrY, ruvA, and polB. Using green fluo- 



rescent protein (GFP) reporter plasmids, Ronen et al. 
,(2QQ2| ) measured the expression of these eight genes at 
fifty time points (every six minutes following ultravio- 
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Fig. 10 Results for the S.O.S DNA repair system for the EBDBN and ABC-Net methods. Blue and red solid edges in the graph 
represent gene-to-gene interactions identified by the EBDBN method that are "true positives" and "false positives," according 
to the known behavior of genes in the S.O.S. network. Dotted gray lines represent gene-to-gene interactions supported by the 
literature that are not identified by the EBDBN method. Blue-filled densities represent the marginal approximate posterior 
distributions found through the ABC-Net method. The feedback loops on the S.O.S. genes (uvrD, uvrY, ruvA, and polB) 
appear to be flexible, while others exhibit greater rigidity. Image taken from ^Rau (2010, ) 



let irradiation of the cells to provoke DNA damage). 
The quantity of GFP is proportional to the quanti- 
ties of the corresponding S.O.S. proteins, which are in 
turn proportional to the corresponding mRNA produc- 
tion rates (jPerrin et al., 2003). As such, it is reason- 



able to assume that the data of Ronen et al. (2002) di- 
rectly indicate the expression levels of each of the S.O.S. 
genes. These data are available at the authors' web- 
site (http://www.weizinann.ac.il/incb/UriAloii). In 



addition, the study performed by Ronen et al. (2002) 
consisted of two different experiments for each of two 
different intensities of ultraviolet light (Experiments 1 
and 2 at 5 Jm~^, and Experiments 3 and 4 at 20 
Jm~^). One recent study by Charbonnier et al. (2010) 
found that Experiments 1 and 4 systematically led to 
poor results for network inference methods, although 
nothing should distinguish them from the other two ex- 
periments. As such, we focus the rest of our discussion 
on the data collected in Experiment 3, which was mea- 
sured with the higher level of ultraviolet light. 



4.2 Analysis 

In addition to the ABC-Net method, we apply the Em- 
pirical Bayes Dynamic Bayesian Network (EBDBN) ap- 
proach of Rau et al. (2010] ) with a hidden state di- 
mension of K = 0, where a 99.9% cutoff is used as 
a threshold for the z-scores of the gene-to-gene inter- 



actions. This particular method is chosen to illustrate 
the benefit of using the ABC-Net approach in tandem 
with other inference methods. As before, we set the 
Gaussian proposal standard deviation in Equation Q 
to (70 = 0.5, and we ran the ABC-Net method for ten 
independent chains of length 1 x 10^, with a thinning 
interval of 50. The VAR(l) simulator is used to generate 
simulated data F"^, and the prior bounds of 7r(0|G) are 
set to (-2,2). We used the Euclidean distance function, 
where the threshold e is selected using the previously 



described heuristic method (Section 3.2), based on the 



1% quantile of distances for 5000 random networks. Due 
to the small size of the network, the maximum fan-in 
was constrained to 2 or less. 

The gene-to-gene interactions identified by the EBDBN 
method are illustrated in Figure p^O} where blue and red 
solid edges represent "true positives" and "false pos- 
tives," according to the previously described behavior 
of the S.O.S. network. We use these terms somewhat 
loosely, because even for well-understood networks such 
as the S.O.S. DNA repair system, the absence of a par- 
ticular gene-to-gene interaction in the literature can- 
not indicate with absolute certainty that such a rela- 
tionship is absent. Gray dotted lines represent gene-to- 
gene interactions supported by the literature that are 
not identified by the EBDBN method. We also exam- 
ine the marginal approximate posterior distributions for 



each of these interactions (Figure 10), as obtained by 



Reverse Engineering Gene Networks Using ABC 



13 




Fig. 11 Interactions exhibiting the highest rigidity in the 
S.O.S DNA repair system for the ABC-Net method. Dot- 
ted gray hnes represent gene-to-gene interactions supported 
by the hterature. Blue-fihed densities represent the marginal 
approximate posterior distributions found through the ABC- 
Net method. The most rigid interactions in the network con- 
nect the recA protein directly to the S.O.S. g enes, bypas sing 
the lexA master regulator. Image taken from |Rau (2010] ) 



the ABC-Net method. As previously seen in the simu- 
lation study, these posterior distributions seem to fall 
into two categories: flexible interactions (the feedback 
loops on uvrD, uvrY, ruvA, and polB) and rigid inter- 
actions (the others). That is, gene-to-gene interactions 
identified by the EBDBN with rigid approximate pos- 
terior distributions appear to be supported by substan- 
tial evidence, as those parameters are restricted to a 
smaller range of values in their posterior distributions. 
On the other hand, those associated with flexible ap- 
proximate posterior distributions may indeed represent 
false positives, since those parameters take on a wider 
range of values without negatively impacting the prox- 
imity of simulated and observed data in the ABC-Net 
algorithm. In this way, the ABC-Net method can help 
yield complementary information about speciflc gene- 
to-gene interactions, as well as the overall dynamics of 
a given biological system. That is, the ABC-Net method 
can serve as a useful reference tool to conflrm or belie 
results obtained by a more speciflc model. 

In addition to comparing the results of the EBDBN 
and ABC-Net methods, we also examine the most rigid 
approximate posterior distributions identifled by the 



latter method (Figure 11). Interestingly, all of the most 
rigid interactions in the S.O.S. DNA repair system are 
those directly connecting the recA protein to the other 
genes in the network, bypassing the lexA master regula- 
tor. This result can be explained by the one-step time 
delay inherent in the VAR(l) simulator of the ABC- 
Net method. More speciflcally, when DNA damage in 
the cell is detected by recA, the abundance of lexA 
decreases very rapidly and the remaining S.O.S. genes 
turn on almost immediately. However, time-delay mod- 
els (like the VAR(l) simulator) are only able to iden- 



tify gene-to-gene interactions that occur with a one- 
step time lag. The result of this is that in the flndings 
of the ABC-Net method, a strong link appears to occur 
directly between recA and the remaining genes in the 
network. 



5 Discussion 

Reverse engineering the structure of GRN from longi- 
tudinal expression data is an intrinsically difflcult task, 
given the complexity of network architecture, the large 
number of potential gene-to-gene interactions in typi- 
cal networks, and the small number of replicates and 
time points available in real data. In this work, we pro- 
posed a non-standard extension of the existing ABC- 
MCMC method ([Marjoram et al., 2003*) to enable infer- 



ence of GRN. Based in approximate Bayesian compu- 
tation, the ABC-Net approach enables Bayesian infer- 
ence for complex, high-dimensional networks for which 
the likelihood is difflcult to calculate. By sampling from 
the approximate posterior distributions of parameters 
involved in GRN, this method yields a wealth of infor- 
mation about the structure and inferability of compli- 
cated biological systems, particularly with respect to 
the flexibility and rigidity of network interactions. For 
the time being, the complexity of real biological systems 
and the computing time required for the ABC-Net lim- 
its its application to small networks. 



As noted by previous authors (e.g., Sisson et al.. 



2QQ7[ [Wegmann et al., 2009, ), there are a number of 
drawbacks to the ABC-MCMC algorithm. For exam- 
ple, the choice of e plays an important role in the chain; 
too large of a value for the threshold e results in a chain 
dominated by the prior distribution, while too small of a 
value leads to extremely low acceptance rates. As such, 
implementation of the ABC-Net method requires some 
user tuning. In addition, the number of steps required 
in the burn-in period and in the chain itself are also 
dependent on this threshold value. Further work is re- 
quired to fully examine the components of the ABC-Net 
method, including more efflcient network structure pro- 
posal schemes, and techniques to identify optimal data 
simulators for real data. In particular, a key aspect in 
this work is the choice of the model used to generate 
pseudo data; recent advances in using ABC algorithms 
for parameter inference and as an exploratory tool for 



model assessment (Ratmann et al., 2011) may be useful 
for this purpose. 

In this work, we have suggested the use of somewhat 
loosely deflned "flexible" and "rigid" gene-to-gene in- 
teractions to better understand the inferability of gene 
regulatory networks; additional work is required to de- 
termine an objective criterion to characterize this be- 
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havior. Although our implementation of ABC meth- 
ods for reverse engineering GRN was a first attempt to 
demonstrate the flexibility and potential of this proce- 
dure, some improvements in performance and efficiency 
can be expected from the implementation of new sim- 
ulation techniques, such as population and sequential 



O = 



Monte Carlo (Del Moral et al., 2006, 2009 Robert 



2010). Finally, a substantial advantage of the ABC- 



Net method is its capacity to analyze time-series digi- 
tal gene expression measurements (e.g., serial analysis 
of gene expression or RNA sequencing data) through a 
simple modification of the data simulator (e.g., an au- 
toregressive simulator for Poisson distribution rates). 
To this end, additional work is required to determine 
the most appropriate techniques for simulating time- 
series count data, as well as distance functions best 
adapted to time-series count data. This goal is particu- 
larly important, as the decreasing cost and refinement 
of next-generation sequencing technology ensure that 
longitudinal gene expression profiles will likely be stud- 
ied using RNA sequencing methodology in the near fu- 
ture. 



Appendix 

Let y and y"^ denote observed and simulated time-course 
expression data, and let T and P denote the number of 
time points collected and total number of genes, respec- 
tively. The Canberra, Euclidean (i^^), and Manhattan 
(L^) distances, respectively, may be defined as 



p{y*,y) 
p{y*,y) 



EE 

t=l i=l 



lytt-Vitl 
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In addition, we also apply a distance measure proposed 5. 



by Lund and Li (2009 ) tailored to multivariate longitu- 
dinal data that we refer to as the Multivariate Time- 
Series (MVT) distance. For the MVT distance, under 6. 
the null hypothesis that and Y have the same net- 
work dynamics, we define 



T-l 



1 



Yt = Oyt-i 



^ = ^ E i(yt - rM - ny + (y. - - yt)'} 



2T 

where y^ and yj" are the observed and simulated time- 
course data, y^ and yj" are the best one-step ahead lin- 
ear predictors of yt and y^, respectively, and i7 is an 
estimate of the common covariance matrix of the errors 
E. With these terms defined, the MVT distance may 
be defined as follows: 

p(y*, y) = ^ E [(y* - y*) - (y* - y* )]' x 
t=i 

X [(yt-yt*)-(yt-yt*)]- 
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